AMENDMENTS TO THE CLAIMS

1. (Currently Amended) A computer-based system for creating from a dataset a data table
comprising sequence identifiers corresponding to a targeted collection of sequences from a
dataset, the dataset comprising sequence identifiers corresponding to natural complex
biopolymer sequences and linked to corresponding annotations, the system comprising:

a) a search function which searches the annotations of the dataset according to a user- 
defined criterion and outputs a first subset of the dataset restricted by the criterion; 

b) a redundancy reducing function which compares the first subset with a first database 
correlating the sequence identifiers of the first subset with syngeneic common source gene 
biopolymers and outputs a second subset of the dataset having reduced unique, natural complex
biopolymer redundancy relative to the first subset; 

c) a selection function which applies to the second subset a user-defined selection
parameter and outputs a third subset of the dataset restricted relative to the second subset by the 
parameter; and 

d) a tabulation function which creates and outputs the targeted collection of sequences in 
the form of a data table comprising, configurable by and sortable by the sequence identifiers of 
the third subset. 
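
For illustration only, the four functions recited in claim 1 can be pictured as a small pipeline. The sketch below is a minimal, hypothetical rendering and not the claimed implementation: the `Record` type, the function names, the identifiers `S1` to `S4`, the gene labels, and the `gene_of` mapping (standing in for the first database) are all assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One dataset entry: a sequence identifier linked to its annotations."""
    seq_id: str
    annotations: dict


def search(dataset, keyword):
    """Step (a): keep records whose annotation values mention the keyword."""
    kw = keyword.lower()
    return [r for r in dataset
            if any(kw in str(v).lower() for v in r.annotations.values())]


def reduce_redundancy(records, gene_of):
    """Step (b): keep one record per common source gene.

    gene_of is a hypothetical stand-in for the first database; it maps a
    sequence identifier to the gene it derives from.
    """
    seen, kept = set(), []
    for r in records:
        gene = gene_of.get(r.seq_id, r.seq_id)  # unmapped ids stand alone
        if gene not in seen:
            seen.add(gene)
            kept.append(r)
    return kept


def select(records, field, value):
    """Step (c): restrict by a selection parameter such as species."""
    return [r for r in records if r.annotations.get(field) == value]


def tabulate(records):
    """Step (d): emit table rows sorted by sequence identifier."""
    rows = [{"seq_id": r.seq_id, **r.annotations} for r in records]
    return sorted(rows, key=lambda row: row["seq_id"])


# Made-up example data; none of these identifiers come from the claims.
dataset = [
    Record("S2", {"description": "kinase isoform 2", "species": "human"}),
    Record("S1", {"description": "kinase isoform 1", "species": "human"}),
    Record("S3", {"description": "protein kinase", "species": "mouse"}),
    Record("S4", {"description": "transporter", "species": "human"}),
]
gene_of = {"S1": "GENE_A", "S2": "GENE_A", "S3": "GENE_B"}

first = search(dataset, "kinase")           # first subset
second = reduce_redundancy(first, gene_of)  # second subset
third = select(second, "species", "human")  # third subset
table = tabulate(third)                     # data table
```

In this sketch the redundancy step simply keeps the first identifier seen for each gene; any other tie-breaking rule would fit the same structure.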

2. (Original) A system according to claim 1, wherein the criterion is selected from the group 
consisting of a keyword and a concept. 

3. (Original) A system according to claim 1, wherein the criterion is one of a plurality of user- 
defined criteria, and the search function searches the annotations of the dataset according to the 
criteria and outputs a first subset of the dataset restricted by the criteria. 

4. (Original) A system according to claim 1, wherein the criterion is one of a plurality of user-
defined criteria, and the search function searches the annotations of the dataset according to the 
criteria and outputs a first subset of the dataset restricted by the criteria, wherein the criteria 
include multiple keywords. 

5. (Canceled)

6. (Original) A system according to claim 1, wherein the dataset is one of a plurality of datasets,
and the search function searches the annotations of the datasets according to the user-defined
criterion and outputs a first subset of the datasets restricted by the criterion.

7. (Canceled) 

8. (Currently Amended) A system according to claim 1, wherein the database is one of a 
plurality of databases correlating the sequence identifiers of the first subset with syngeneic 
common source gene biopolymers, and the redundancy reducing function compares the first
subset with the databases and outputs the second subset of the dataset. 

9. (Currently Amended) A system according to claim 1, wherein the parameter is selected from 
the group consisting of source, species, author, and pathway.

10. (Original) A system according to claim 1, wherein the parameter is one of a plurality of
user-defined selection parameters, and the selection function applies to the second subset the 
parameters and outputs the third subset restricted relative to the second subset by the parameters. 

11. (Currently Amended) A system according to claim 1, wherein the redundancy reducing
function outputs a second subset of the dataset which eliminates unique, natural complex
biopolymer redundancy relative to the first subset.

12. (Original) A system according to claim 1, further comprising an expansion function which 
searches a second database for synonyms of the sequence identifiers of the first, second or third 
subset. 

13. (Currently Amended) A computer-based method for creating from a dataset a data table
comprising sequence identifiers corresponding to a targeted collection of sequences from a
dataset, the dataset comprising sequence identifiers corresponding to natural complex
biopolymer sequences and linked to corresponding annotations, the method comprising
computer-implemented steps of:

a) searching with a computer the annotations of the dataset according to a user-defined 
criterion and outputting a first subset of the dataset restricted by the criterion;

b) comparing with the computer the first subset with a database correlating the sequence 
identifiers of the first subset with syngeneic common source gene biopolymers and outputting a 
second subset of the dataset having reduced unique, natural complex biopolymer redundancy
relative to the first subset; 

c) applying to the second subset a user-defined selection parameter and outputting a third 
subset of the dataset restricted relative to the second subset by the parameter; and 

d) creating and outputting the targeted collection of sequences in the form of a data table 
comprising, configurable by and sortable by the sequence identifiers of the third subset.

14. (Currently Amended) A computer-based system for creating from a plurality of datasets a
data table comprising sequence identifiers corresponding to a targeted collection of sequences
from a plurality of datasets, the datasets comprising sequence identifiers corresponding to natural
complex biopolymer sequences, the system comprising:

a) a merge and redundancy reducing function which compares the datasets with a 
database correlating the sequence identifiers with syngeneic common source gene biopolymers 
and creates a subset of the sum of the datasets having reduced unique, natural complex
biopolymer redundancy relative to the sum; and 

b) a tabulation function which creates and outputs the targeted collection of sequences in 
the form of a data table comprising, configurable by and sortable by the sequence identifiers of
the subset. 
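
As a hedged illustration of the merge and redundancy reducing function of claim 14, the sketch below pools the identifiers of several datasets and keeps one identifier per source gene. The function name `merge_and_reduce`, the `gene_of` mapping (a stand-in for the database correlating identifiers with genes), and all identifiers are hypothetical and appear nowhere in the claims.

```python
def merge_and_reduce(datasets, gene_of):
    """Pool identifiers from all datasets and keep one per source gene."""
    chosen = {}
    for dataset in datasets:
        for seq_id in dataset:
            # Identifiers missing from the mapping are treated as their own gene.
            gene = gene_of.get(seq_id, seq_id)
            chosen.setdefault(gene, seq_id)  # first identifier seen wins
    # Returning a sorted list stands in for a table sortable by identifier.
    return sorted(chosen.values())


# Made-up example data.
datasets = [["S1", "S3"], ["S2", "S3", "S4"]]
gene_of = {"S1": "GENE_A", "S2": "GENE_A", "S3": "GENE_B"}

subset = merge_and_reduce(datasets, gene_of)
```

Here `S2` is dropped because it shares a gene with `S1`, and the duplicate `S3` collapses to a single entry.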

15. (Original) A system according to claim 14, wherein the merge and redundancy reducing 
function further comprises a selection function which applies a user-defined selection parameter 
whereby the subset is restricted relative to the sum of the datasets by the parameter.

16. (Currently Amended) A system according to claim 14, wherein the merge and redundancy
reducing function further comprises a selection function which applies a user-defined selection
parameter whereby the subset is restricted relative to the sum of the datasets by the parameter,
wherein the parameter is selected from the group consisting of source, author, and pathway.

17. (Currently Amended) A computer-based method for creating from a plurality of datasets a
data table comprising sequence identifiers corresponding to a targeted collection of sequences
from a plurality of datasets, the datasets comprising sequence identifiers corresponding to natural
complex biopolymer sequences, the method comprising computer-implemented steps of:

a) comparing the datasets with a database correlating the sequence identifiers with 
syngeneic common source gene biopolymers and creating a subset of the sum of the datasets
having reduced unique, natural complex biopolymer redundancy relative to the sum; and

b) creating and outputting the targeted collection of sequences in the form of a data table
comprising, configurable by and sortable by the sequence identifiers of the subset.

18. (Currently Amended) A computer-based system for creating from a dataset a data table 
comprising sequence identifiers corresponding to a targeted collection of sequences from a
dataset, the dataset comprising sequence identifiers corresponding to natural complex
biopolymer sequences and linked to corresponding first annotations, the system comprising: 

a) an integration function which merges the dataset with a database comprising second 
annotations attributable to and correlated with at least a subset of the sequence identifiers or 
sequences of the dataset and which links the second annotations to the corresponding sequence 
identifiers of the subset; and 

b) a tabulation function which creates and outputs the targeted collection of sequences in 
the form of a data table comprising, configurable by and sortable by the sequence identifiers of
the subset and the second annotations. 
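
The integration function of claim 18 can be pictured, under stated assumptions, as a join keyed on sequence identifier. In the sketch below, `integrate`, `sort_table`, the `expression` column, and every identifier and value are hypothetical; the second database is modeled as a plain dictionary that covers only some of the identifiers.

```python
def integrate(dataset, second_db):
    """Link second annotations to the identifiers they correlate with."""
    rows = []
    for seq_id, first_annotations in dataset.items():
        row = {"seq_id": seq_id, **first_annotations}
        # Only identifiers present in the second database gain new columns.
        row.update(second_db.get(seq_id, {}))
        rows.append(row)
    return rows


def sort_table(rows, column):
    """Sort by a numeric column, placing rows that lack it last."""
    return sorted(rows, key=lambda r: (column not in r, r.get(column, 0)))


# Made-up example data.
dataset = {
    "S1": {"description": "kinase"},
    "S2": {"description": "transporter"},
    "S3": {"description": "receptor"},
}
second_db = {"S1": {"expression": 2.5}, "S3": {"expression": 0.7}}

table = integrate(dataset, second_db)
by_expression = sort_table(table, "expression")
```

Sorting on the added column shows the table being ordered by a second annotation rather than by identifier.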

19. (Original) A system according to claim 18, wherein the second annotations comprise data 
attributable to and correlated with at least a subset of the sequence identifiers or sequences of the 
dataset, said data selected from the group consisting of: gene expression data, sequencing data, 
genotype data, polymorphism data and clinical data. 

20. (Currently Amended) A computer-based method for creating from a dataset a data table
comprising sequence identifiers corresponding to a targeted collection of sequences from a
dataset, the dataset comprising sequence identifiers corresponding to natural complex
biopolymer sequences and linked to corresponding first annotations, the method comprising
computer-implemented steps of: 

a) merging the dataset with a database comprising second annotations attributable to and 
correlated with at least a subset of the sequence identifiers or sequences of the dataset and 
linking the second annotations to the corresponding sequence identifiers of the subset; and 

b) creating and outputting the targeted collection of sequences in the form of a data table 
comprising, configurable by and sortable by the sequence identifiers of the subset and the second
annotations. 

21. (Currently Amended) A system according to claim 1, further comprising: 

a second computer-based system for creating from a plurality of datasets a data table 
comprising sequence identifiers corresponding to a targeted collection of sequences from a 
plurality of datasets, the datasets comprising sequence identifiers corresponding to natural
complex biopolymer sequences, the second system comprising: 

a) a merge and redundancy reducing function which compares the datasets with a 
database correlating the sequence identifiers with syngeneic common source gene biopolymers 
and creates a subset of the sum of the datasets having reduced unique, natural complex
biopolymer redundancy relative to the sum; and 

b) a tabulation function which creates and outputs the targeted collection of sequences in 
the form of a data table comprising, configurable by and sortable by the sequence identifiers of
the subset. 

22. (Currently Amended) A system according to claim 1, further comprising: 

a second computer-based system for creating from a dataset a data table comprising 
sequence identifiers corresponding to a targeted collection of sequences from a dataset, the
dataset comprising sequence identifiers corresponding to natural complex biopolymer sequences
and linked to corresponding first annotations, the second system comprising:

a) an integration function which merges the dataset with a database comprising second 
annotations attributable to and correlated with at least a subset of the sequence identifiers or 
sequences of the dataset and which links the second annotations to the corresponding sequence
identifiers of the subset; and

b) a tabulation function which creates and outputs the targeted collection of sequences in 
the form of a data table comprising, configurable by and sortable by the sequence identifiers of
the subset and the second annotations. 

23. (Currently Amended) A system according to claim 1, further comprising: 

a second computer-based system for creating from a plurality of datasets a data table 
comprising sequence identifiers corresponding to a targeted collection of sequences from a 
plurality of datasets, the datasets comprising sequence identifiers corresponding to natural
complex biopolymer sequences, the second system comprising: 

a) a merge and redundancy reducing function which compares the datasets with a 
database correlating the sequence identifiers with syngeneic common source gene biopolymers 
and creates a subset of the sum of the datasets having reduced unique, natural complex
biopolymer redundancy relative to the sum; and 

b) a tabulation function which creates and outputs the targeted collection of sequences in
the form of a data table comprising, configurable by and sortable by the sequence identifiers of 
the subset; and, 

a third computer-based system for creating a targeted collection of sequences from a 
dataset comprising sequence identifiers corresponding to natural complex biopolymer sequences 
and linked to corresponding first annotations, the third system comprising: 

a) an integration function which merges the dataset with a database comprising second 
annotations attributable to and correlated with at least a subset of the sequence identifiers or 
sequences of the dataset and which links the second annotations to the corresponding sequence 
identifiers of the subset; and 

b) a tabulation function which creates and outputs the targeted collection of sequences in 
the form of a data table comprising, configurable by and sortable by the sequence identifiers of 
the subset and the second annotations. 

24. (Canceled) 